Pointers with Restricted Aliasing
Introduction
The main benefit of the regions described thus far is also their drawback: to free data you must free an entire region. This implies that to amortize the cost of creating a region, one needs to allocate into it many times. Furthermore, the objects allocated in a region should be mostly in use until the region is freed, or else memory will be wasted in the region that is unused by the program.
In an attempt to alleviate each of these problems Cyclone provides a mechanism by which individual objects in a region may be freed prior to freeing the entire region. Pointers to objects that may be freed early must obey an aliasing discipline to prevent dangling pointers. A static analysis ensures that such objects can only ever be accessed through one pointer at any time. At the time it is freed, this pointer is invalidated, thus preventing all future accesses to the object.
Pointer types in Cyclone can be qualified by their aliasability. As of
now, there are four alias qualifiers. Each of these qualifiers must
appear as arguments to the @aqual(
…)
pointer qualifier mentioned in
section Pointers. The four alias qualifiers are:
Aliasable Pointer types are by default qualified as
ALIASABLE
. These may be freely aliased, but can never be freed. For
instance, int @@aqual(ALIASABLE)
is an aliasable non-null pointer to
an int
.
Unique The UNIQUE
qualifier on a pointer allows the object pointed
to by that pointer to be deallocated individually, using the function
rufree
. For freeing objects to be safe, all accesses to such objects
must be made through a single UNIQUE
pointer. That is, only a single
pointer may be used to access the object at any given time; this
trivially guarantees that if the object is freed through its unique
pointer, no other access to the object beyond that point is
possible. Objects that become unreachable but are not freed manually
will be freed by the garbage collector (assuming it’s not removed with
-nogc
). For instance, int ?@aqual(UNIQUE)
is a unique fat pointer to
an int
.
Reference-counted The REFCNT
qualifier reference-counted qualifier
also permits freeing individual objects. Unlike the UNIQUE
qualifier,
multiple pointers to a single object are permitted, the number of
which is tracked dynamically via a hidden reference count stored with
the object. Additional pointers to an object are created explicitly
via a call to alias_refptr
, which increases the reference
count. Individual pointers are removed via a call to rdrop_refptr
;
when the last pointer is removed (i.e., the reference count is 0), the
object is freed. Like the unique region, objects that become
unreachable will be freed by the garbage collector. For instance, int
*@aqual(REFCNT)
is a nullable pointer to a reference counted int
.
Restricted All the alias qualifiers are arranged in a subtyping
hierarchy. The RESTRICTED
qualifier is a super-type of all the other
qualifiers. A RESTRICTED
pointer may not be freed nor can any aliases
of the pointer be created. For instance, int *@aqual(RESTRICTED)
is a
restricted nullable pointer to an int
. The subtyping hierarchy is as
below:
RESTRICTED
/ | \
/ | \
/ | \
ALIASABLE UNIQUE REFCNT
Allocation and Freeing
Unique and reference-counted pointers can be allocated in any region. However, to support freeing individual objects from a region we use a special allocator (see Reap Allocator Implementation) that maintains additional metadata. We use the term “reap” (= region + heap) to refer to regions from which individual objects can be deleted. Any region can contain objects that can be individually freed. Thus all regions are in fact reaps.
The allocation mechanism for unique and reference-counted pointers
follows a pattern similar to that used for normal region allocation
described in the previous section. The allocation functions rely on
handles for alias qualifiers which are of type
aqual_t<`q::Q>
. Here `q is a type variable of alias qualifier
kind and may be instantiated with the alias qualifiers above. The
Cyclone libraries provide the following global variables:
aqual_t<ALIASABLE> Core::aliasable_qual
aqual_t<UNIQUE> Core::unique_qual
aqual_t<REFCNT> Core::refcnt_qual
which are handles for the aliasable, unique and reference-counted qualifiers respectively. Note that there is no handle for the restricted qualifier.
The following expressions are used for allocation:
qnew(
aq-identifier)
exprqnew(
aq-identifier)
array-initializer
The qnew
functions allocate in the `H region, i.e., the heap. They
expect an alias qualifier handle. For instance, to allocate a unique
integer in the heap, one might use
int @@aqual(UNIQUE) a = qnew(Core::unique_qual) 0;
rnew(
rgn-identifier,
aq-identifier)
exprrnew(
rgn-identifier,
aq-identifier)
array-initializer
To allocate an alias-free object in a region other than the heap the rnew functions can be used. For instance,
{
region r;
int @@aqual(UNIQUE) `r a = rnew(r, Core::unique_qual) 0;
}
If aq-identifier is not specified it defaults to the ALIASABLE
handle.
qmalloc(
aq-identifier,sizeof(
type))
qcalloc(
aq-identifier,n,sizeof(
type))
Similar to qnew
these functions allocate in the heap.
rmalloc(
rgn-identifier,
aq-identifier,sizeof(
type))
rcalloc(
rgn-identifier,
aq-identifier,n,sizeof(
type))
Similar to rnew
these functions can be used to allocate alias
restricted pointers in the specified region. If aq-identifier is
omitted, it defaults to ALIASABLE
.
Since there is no handle for the restricted qualifier, RESTRICTED
qualified pointers cannot be allocated. As with normal region
allocation, we use very simple type inference to ease the burden of
writing qualifiers on types. For instance,
int *a = rnew(r, Core::unique_qual) 0;
suffices to allocate a unique integer in the region r
.
The Cyclone library provides the following functions to free alias-free pointers.
Core::ufree(
ptr)
To free a heap directed unique pointerufree
is used.Core::rufree(
rgn-handle,
ptr)
To free a unique pointer from a reaprufree
is used. A constraint on the effect qualifiers that appear on the arguments of this function is described in Subtyping for Effect Qualifier and Reaps.Core::drop_refptr(
ptr)
To reduce the reference count on a heap resident reference counted objectdrop_refptr
is used. The ptr can no longer be dereferenced following this call. If the reference count on the object reaches zero then the object is freed automatically.Core::rdrop_refptr(
rgn-handle,
ptr)
To reduce the reference count on a reference counted object in an arbitrary regionrdrop_refptr
is used.
Unique Pointers
To ease programming with unique pointers and allow reuse of library
code, unique pointers can be aliased temporarily within a designated
lexical scope using a special alias pattern. If this kind of aliasing
is not sufficient, users can choose to allocate reference-counted
objects; this idea is explained in the next subsection. We also define
syntax a :=: b
to allow two unique pointers a
and b
to be
atomically swapped. Careful use of the swap operator allows us to
store unique pointers in objects that are not themselves uniquely
pointed to. We also introduce bounded polymorphism over alias
qualifiers and add an additional kind to specify type variables over
these qualifiers. In practice, all of these mechanisms are necessary
for writing useful and reusable code.
Simple Unique Pointers
Having a unique pointer ensures the object pointed to is not reachable
by any other means. When pointers are first allocated, e.g., using
malloc
, they are unique. Such pointers are allowed to be read
through (that is, dereferenced or indexed) but not copied, as the
following example shows:
char @fat @aqual(UNIQUE) buf = malloc(3*sizeof(char));
buf[0] = 'h';
buf[1] = 'i';
buf[2] = '\0';
...
ufree(buf);
This piece of code allocates a unique buffer, and then indexes that buffer three times to initialize it. Because the process of storing to the buffer does not copy its unique pointer, it can be safely freed.
When a unique pointer is copied, e.g., when passed as a parameter to a function or stored in a datastructure, we say it has been consumed. We ensure that consumed pointers are not read through or copied via a dataflow analysis. When a consumed pointer is assigned to, very often it can be unconsumed, making it accessible again. Here is a simple example that initializes a datastructure with unique pointers:
1 struct pair { int *@aqual(UNIQUE) x; int *@aqual(UNIQUE) y; } p;
2 int *@aqual(UNIQUE) x = new 1; // initializes x
3 p.x = x; // consumes x
4 x = new 2; // unconsumes x
5 p.y = x; // consumes x
If an attempt was made to read through or copy x between lines 3 and 4 or after line 5, the flow analysis would reject the code, as in
int *@aqual(UNIQUE) x = new 1; // initializes x
p.x = x; // consumes x
p.y = x; // rejected! x has been consumed already
When a multi-word data structure is assigned to another one, all of the unique pointers it contains are consumed. For example:
1 struct pair { int *@aqual(UNIQUE) x; int *@aqual(UNIQUE) y; } p, q;
2 p.x = new 1; p.y = new 2;
3 q = p; // consumes p.x and p.y
By default, when a unique pointer is passed to a function, we expect that the function will not free the pointer, store it in a data structure, or otherwise make it unavailable to the caller. If a function violates any of thse assumptions its type must be augmented with the attribute consume, which indicates that a particular argument is no longer available to the caller after the call. For example:
void foo(int *@aqual(UNIQUE) x) __attribute__((consume(1))) {
ufree(x);
}
void bar() {
int *@aqual(UNIQUE) x = malloc(sizeof(int));
foo(x);
*x;//<--- this dereference is not allowed
}
Here, the consume(1)
attribute in the definition of foo
indicates that
the first argument is consumed within the function body.
Note that if you fail to free a unique pointer, it will eventually be garbage collected.
Unique pointers have some restrictions. In particular:
- No pointer arithmetic is allowed on unique pointers. This ensures that all unique pointers point to the beginning of the object, so that the allocator is not confused when a pointer is passed to ufree.
- Take the address of a unique pointer is not allowed. This is because doing so effectively creates an alias to the original pointer that cannot be easily tracked statically.
- Unique pointers cannot be stored within datatypes (though they can be stored in tagged unions). This is a shortcoming of the current flow analysis.
Nested Unique Pointers
Directly reading a unique pointer is only allowed along a unique path. A unique path u has the form
u ::= x ∣ u.m ∣ u->m ∣ *u
where x is a local variable, and u is a unique pointer. To appreciate the unique-path restriction, consider this incorrect code:
int f(int *@aqual(UNIQUE) *`r x) {
int *@aqual(UNIQUE) *`r y = x; //x and y are aliases
int *@aqual(UNIQUE) z = *y;
ufree(z);
return **x; //accesses deallocated storage!
}
Here, x
is a pointer into a conventional region `r
and thus its value
can be freely copied to y
. We then extract a unique pointer pointed to
by y
and free it. Then we attempt to access the deallocated storage
through x
.
If a unique pointer is not accessible via a unique path, it must be
swapped out atomically to be used; in Cyclone this is expressed with
syntax :=:
. In particular, the code a :=: b
will swap the contents
of a
and b
. We can use this to swap out a nested unique pointer,
and replace it with a different one; we will often swap in NULL
, since
this is a unique pointer that is always unconsumed. For example, in
the code below, we define a queue type for queues that contain unique
pointers, and a function take
for removing the first element from the
queue.
struct Queue<`a,`r> {
list_t<`a *@aqual(UNIQUE),`r> front;
list_t<`a *@aqual(UNIQUE),`r> rear;
};
typedef struct Queue<`a,`r> *`r queue_t<`a,`r>;
`a *@aqual(UNIQUE) take(queue_t<`a> q) {
if (q->front == NULL)
throw &Empty_val; // exception: def not shown
else {
let elem = NULL;
elem :=: q->front->hd;
q->front = q->front->tl;
if (q->front == NULL) q->rear = NULL;
return elem;
}
}
Here, in order to extract the element stored in the queue (the hd
portion of the underlying list), we need to use swap, because q->front
is a non-unique pointer, and therefore q->front->hd
is not a unique
path.
Note that this code is not as polymorphic as it could be. In
particular, the above queue definition requires its elements to be
nullable unique pointers, when they could just as easily be non-unique
pointers, or even reference-counted pointers (illustrated later), and
the code for take
would still work. This problem can be addressed, and
its solution is described in Qualifier
Polymorphism.
Pattern Matching on Unique Pointers
As described in Pattern Matching, Cyclone supports pattern
matching on structured data with let
declarations and switch
statements. Unique pointers, or structures containing unique pointers,
can be matched against, while still ensuring that only one legal
pointer to a unique object exists at any given time.
In the simplest case, when a unique pointer to a structure is matched against, the matching operation is treated just like a dereference. Therefore, the pointer itself is not consumed. For example:
struct pair { int x; int y; };
void foo() {
struct pair @@aqual(UNIQUE) p = new pair(1,2);
let &pair{.x=x, .y=y} = p;
ufree(p);
}
Here, we match against the unique pointer p
’s two fields x
and
y
. Because we don’t make a copy of p
, but rather only of its fields, p
is not consumed. Therefore, p
can be safely freed.
Because each of the fields matched against is assigned to the pattern variables, unique paths through the original pointer are consumed by virtue of being assigned. At the conclusion of the scope of the pattern, we can unconsume any location whose pattern variable has not been consumed or assigned to, as long as the parent pointer has not been consumed or assigned to. Here’s an example:
struct Foo { int *@aqual(UNIQUE) x; int *@aqual(UNIQUE) y; };
void foo(struct Foo *@aqual(UNIQUE) p) {
{ let &Foo{.x=x, .y=y} = p; // consumes p->x and p->y
ufree(x); // consumes x
} // p->y is unconsumed
ufree(p->y); // p->y consumed
ufree(p); // p consumed
}
The initial match against p
consumes p->x
and p->y
, whose contents are
copied to x
and y
, respectively. At the conclusion of the block, p->y
is unconsumed because it did not change, whereas p->x
is not, because
x
was freed within the block.
Note that the following code is illegal:
void foo(struct Foo *`H p) {
let &Foo{.x=x, .y=y} = p; // non-unique path!
...
}
To see why, notice that this is equivalent to
void foo(struct Foo *`H p) {
let x = p->x;
let y = p->y;
...
}
This code is illegal because neither p->x
nor p->y
is a unique
path. We also do not allow *
patterns to create aliases of the
original unique pointer, for the same reason we forbid &
e when e
is a unique pointer. Unfortunately, this means we don’t provide a way
to assign to matched-against fields. However, in the case of the
matched-against struct, we can just do this with regular paths. In the
above example pattern block, we could do p->y = new 1
or something
like that (even within the scope of the pattern).
Matching against tagged unions is essentially like matching against structures, as just described. Since we do not allow unique pointers to be stored within datatypes, there is no change to how datatypes are matched.
Reference-counted Pointers
Cyclone also supports reference-counted pointers, which are treated
quite similarly to unique pointers. Reference-counted objects may be
allocated in any region. We define the constant Core::refcnt_qual
,
having type aqual_t<REFCNT>
, for creating reference-counted
pointers. The caveat here is that when you allocate something in this
region, an extra word will be prepended to your data, which contains
the reference count, initialized to 1.
As with unique pointers, no pointer arithmetic is allowed, for similar
reasons: it can occlude where the “head” of the object is, and thus
make it impossible to find the hidden reference count. The reference
count can be accessed via the routine Core::refptr_count
:
int refptr_count(`a::TA ?@aqual(REFCNT) ptr);
The constant NULL
is allowed to have type `a::A ?@aqual(REFCNT)
, and
its reference count is always 0. Like unique pointers, implicit
aliasing is not allowed. Aliases are created explicitly using the
routine Core::alias_refptr
:
`a ?@aqual(REFCNT) alias_refptr(`a::TA ?@aqual(REFCNT) ptr);
This routine returns a copy of its argument, which is itself not
consumed. Furthermore, the reference count will be incremented by
one. Reference counts are reduced explicitly by the routine
drop_refptr
:
void drop_refptr(`a::TA ?@aqual(REFCNT) ptr)
__attribute((consume(1)));
In the case that the provided object’s reference count is 1 (and is thus dropped to zero), the provided pointer is freed. The flow analysis will consume the passed pointer (as is always the case for function arguments), so you won’t be able to use it afterwards. Just like unique pointers, you can “forget” reference-counted pointers without decrementing the count; this just means you’ll never be able to free the pointer explicitly, but the GC will get it once it becomes unreachable.
Just like unique pointers, reference-counted pointers can be stored in
normal, aliasable datastructures, and accessed using swap (e.g. x :=:
y
). Because NULL
is a `a::TA ?@aqual(REFCNT)
pointer, we can always
cheaply construct a pointer to swap in.
A good example of the use of unique pointers and reference-counted
pointers is in the Cyclone distribution’s tests
directory—the file
streambuff.cyc
. This is an implementation of a packet manipulation
library with a representation for packets (called streambuff_t
’s) that
is similar to Linux’s skbuff_t
’s. It uses a combination of unique
header structures and reference-counted data structures.
Qualifier Polymorphism
To allow the writing of reusable code we support both subtyping and
bounded polymorphism over alias qualifiers. Type variables that range
over the set of alias qualifiers are of kind Q
are used in addition to
the other kinds.
The list data structure in the Cyclone libraries illustrates many features of qualifier polymorphism. It has the following declaration:
struct List<`a::B,`r::R,`q::Q>{
: RESTRICTED >= aquals(`a), RESTRICTED >= `q
`a hd;
struct List<`a,`r,`q> *@aqual(`q) `r tl;
};
typedef struct List<`a,`r,`q> *@aqual(`q) `r list_t<`a,`r,`q>;
Here, the structure is parameterized by three type variables. The
first, of boxed-kind, admits instantiation by pointer types. Since
pointer types may be qualified according to their aliasability, we
require a bound on this aliasability. For this purpose, we use the
construct aquals(`a)
. Similar to the regions(`a)
construct,
aquals(`a)
evaluates to the top-level alias qualifier of the type that
instantiates the variable `a
. For instance, aquals(int
*@aqual(ALIASABLE) *@aqual(UNIQUE)) = UNIQUE
. The bound in the list
declaration RESTRICTED >= aquals(`a)
states that `a
can be
instantiated with a boxed kind with an aliasability that is a subtype
of RESTRICTED
. Since RESTRICTED
is at the top of the subtyping
hierarchy, this is the most general bound on the aliasability of the
type.
The second type variable is of region kind. These types may not be qualified by aliasability and thus do not appear in the bounds at all.
Finally, we have the type variable `q::Q
, of alias qualifier
kind. Thus it can be instantiated with any type that reduces to one of
the alias qualifiers. That is, one of ALIASABLE, UNIQUE, REFCNT,
RESTRICTED or aquals(`a)
. The bound on `q
also is the most general
bound.
A list that uses unique pointers on the “spine” with reference counted elements might be constructed as follows:
int *rc_int = qnew(Core::refcnt_qual) 0;
int *rc_int2 = qnew(Core::refcnt_qual) 1;
list_t<int*@aqual(REFCNT),`H,UNIQUE> l =
rnew(Core::heap_region, Core::unique_qual)
List{rc_int,
rnew(Core::heap_region, Core::unique_qual)
List{rc_int2, NULL}};
We can also quantify over alias qualifiers in function types. For instance, a function that copies a list can be defined as follows.
list_t<`a,`r,`q> rqcopy(region_t<`r> r,aqual_t<`q> q,
list_t<`a,`r2,`p> l
: RESTRICTED >= `q,
ALIASABLE >= aquals(`a),
RESTRICTED >= `p) {
if(l == NULL)
return NULL;
_ tl = NULL;
tl :=: l->tl;
list_t<`a,`r,`q> result = rnew(r,q) List{l->hd, rqcopy(r,q,tl)};
l->tl :=: tl;
return result;
}
This function copies a list allocated in a region r2 into a region
r. As previously, the function is polymorphic in both the source and
destination regions. Furthermore, the aliasability of the new list is
specified by the handle q which has type aqual_t<`q>
. As previously,
since `q
is of alias qualifier kind, a bound can be specified for it —
this appears after the argument list along with any effects for this
function. Note that the list that is being copied may also have a
spine that is RESTRICTED as specified by the RESTRICTED >= `p
bound. This bound requires that we use the swap operator (:=:) to
ensure that no aliases are manufactured. Finally, since we are copying
the list, the elements of the list itself must be aliasable — this is
specified by the ALIASABLE >= aquals(`a)
bound.
Aliasing Unique Pointers
Programmers often write code that aliases values temporarily, e.g. by storing them in loop iterator variables or by passing them to functions. Such reasonable uses would be severely hampered by “no alias” restrictions on unique pointers. To address this problem, we introduce a special alias pattern variable that permits temporary aliasing of a unique pointer. Here is a simple example:
char *@fat@aqual(UNIQUE) dst, *@fat@aqual(UNIQUE) src = ...
{ let alias <`r>char *@fat`r x = src; // consumes src
memcpy(dst,x,numelts(x)); }
// src unconsumed
...
ufree(src);
In general, an alias pattern has form alias <`r>_t_ x
, where `r
is a
fresh region variable, and t is the type of x, which may mention
`r
. The alias pattern introduces a region `r
, copies src
to x
which is treated as having the designated type char *@fat`r
.
Because `r
is non-unique, x
can be freely aliased. As such, we can
pass x
to the memcpy
function. The matching operation consumes src
during the block, and unconsumes it upon exit, so that src
can be
ultimately freed.
Alias pattern variables are similar to regular pattern variables. Like
regular pattern variables, the matched-against expression (i.e. src in
the above example) must be a unique path, and is consumed as a result
of the match. As well, this expression can be unconsumed at the
conclusion of the surrounding block as long as it hasn’t been
overwritten. However, in the case of regular pattern variables,
unconsumption also requires that the pattern variable itself (i.e. x
in the above example) hasn’t changed within the block, while this
requirement is unnecessary for alias patterns.
Intuitively, alias pattern variables are sound because we cast a unique pointer to instead point into a fresh region, for which there is no possibility of either creating new values or storing existing values into escaping data structures. As such we cannot create aliases that persist beyond the surrounding scope. However, we must take care when aliasing data having recursive type. For example, the following code is unsound:
void foo(list_t<`a, `r1, UNIQUE> l) {
let alias <`r> x = (list_t<`a, `r1, UNIQUE>)l;
x->tl = x; // unsound: creates alias!
}
In this case, the alias effectively created many values in the fresh
region `r
: one for each element of the list. This allows storing an
alias in an element reachable from the original expression l, so that
when the block is exited, this alias escapes.
For improved programmer convenience, the Cyclone typechecker optimistically inserts alias blocks around function-call arguments that are unique pointers when the formal-parameter type is polymorphic in the pointer’s region. If this modified call does not type-check, we remove the inserted alias. For example, the alias pattern in the foo function above could be inferred, so we could instead write:
int foo() {
list_t<int,`H,UNIQUE> l = new List(1,new List(2,NULL));
return length(l);
}
Right now, alias inference in Cyclone is fairly primitive, but could be extended to more contexts. We plan to improve this feature in future releases.
Dynamic Regions
Dynamic regions combine reference-counted or unique pointers and
lexical regions together to essentially create reference-counted or
unique regions; that is, the region is completely first class, and
can be created or freed at conceptually any program point. This is
done by representing a dynamic region as a unique (or
reference-counted) pointer to an abstract struct DynamicRegion
(which
internally just contains the handle to a lexical region). The unique
(or ref-counted) pointer is called the key. The key serves as a
run-time capability that grants access to the region. At run-time, a
key can be presented to a special open
primitive, described later,
that grants lexical access to the region.
The operation new_ukey()
creates a fresh dynamic region and returns a
unique key for the region; new_rckey()
creates a fresh dynamic region
and returns a ref-counted key for the region. The operations
free_ukey()
and free_rckey()
are used to destroy unique and
ref-counted keys respectively. The free_ukey()
operation reclaims the
key’s region, as well as the storage for the key. The free_rckey()
operation decrements the reference count, and if it’s zero, reclaims
the key’s region as well as the storage for the key. Because
ref-counted keys are pointers, you can use alias_refptr
to make a copy
of a ref-counted key. (Obviously, you can’t make a copy of a unique
key.) By the same token, you can pass a ref-counted key to drop_refptr
(and you can pass a unique key to ufree
), but doing so won’t actually
deallocate the region, but rather only the key. To obtain a dynamic
reap (i.e. a dynamic region that supports the deletion of individual
objects) use new_reap_ukey
and new_reap_rckey
instead.
Given a key k, a user can access the contents of its region by
temporarily “opening the region” within a lexical scope. This is done
with the syntax region
r = open
k. That is, within the
remainder of the current scope, the region handle r can be used to
access k’s region. The key k is temporarily consumed throughout
the scope, and then unconsumed at its conclusion. This prevents you
from opening up the dynamic region, and then freeing it while it’s
still in use. Note that open is very similar to alias in this way.
Here is a small example of the use of dynamic regions.
int main() {
// Create a new dynamic region
let NewDynamicRegion{<`r> key} = new_ukey();
// At this point, we refer to the region `r to
// specify types, but we cannot actually access
// `r (i.e. it's not in our "static capability,"
// a concept explained later)
list_t<int,`r> x = NULL;
// We can access x by opening the region, which
// temporarily consumes the key
{ region h = open(key);
x = rnew(h) List(3,x);
}
// Now we can access the key again, but not x.
// So we have to open the region to increment
// its contents
{ region h = open(key);
int i = x->hd + 1;
x = rnew (h) List(i,x);
}
// Finally, destroy the key and the region
free_ukey(key);
}
First, we allocate a new unique key and open it up, to reveal the name
of the key’s region (`r
), and the key itself. Because `r
is now in
scope, we can declare a variable x
that refers to it. However, because
the key key must be opened before `r
becomes accessible, we cannot
actually do anything with x
yet (like dereference it).
Next, we open up the region using key
, assigning its handle to the
vairable h
. Now, key
is inaccessible (consumed) in the surrounding
block, which prevents us from doing anything that might cause it to be
freed while it’s in use. We can use h
to allocate into `r
, so we
allocate a list element and store it in x
.
At the conclusion of the block, the region `r
becomes inaccessible
again, so once again we cannot dereference x
. However, key
can now be
accessed again, so we can open it again in the following block, to add
a new list cell to x
. At the conclusion of this block, key
is
unconsumed once again, so we legally call free_ukey
. This frees the
key and the region `r
.
You can “share” a dynamic region key by placing it in some shared data structure, like a global variable. Of course, you’ll then have to swap with NULL to get it in and out of the shared data structure, as the following code demonstrates:
struct MyContainer { <`r>
uregion_key_t<`r> key;
list_t<int,`r> data;
} *\U `H global = NULL;
int main() {
// allocate a dynamic region, and create a list
let NewDynamicRegion{<`r> key} = new_ukey();
list_t<int,`r> x = NULL;
{ region h = open(key);
x = rnew(h) List(3,x);
}
// Stick the key and list in a global data
// structure. We've now lost direct access to
// the key and x.
global = new MyContainer{key,x};
// But we can regain it by swapping for the
// container.
struct MyContainer *@aqual(UNIQUE) p = NULL;
global :=: p;
// Now we can use it as above
let MyContainer{<`r2> key2, data2} = *p;
list_t<int,`r2> d = data2;
{ region h = open(key2);
int i = d->hd + 1;
d = rnew (h) List(i,d);
}
}
Here, we define a global variable having type MyContainer, which consists of a key and some data into that key’s region. The main function allocates a unique as before, and allocates some data into its region. Then we create a container for that key and data, and store it into the global variable; this consumes key, making it inaccessible, and effectively preventing access of x as well.
But we can then get the container back out of the global variable by swapping its contents with NULL. Then we can open up the container, and use the key and data as before. This way, a single dynamic region can be used by many different functions in the program. They simply swap out the global when they need it, operate on it, and then swap in the result.
One problem with using this technique with unique keys arises when you need to open the same region multiple times. The problem, of course, is that if you swap in NULL, then whoever tries to swap it out will fail. In other words, you can’t really do recursive opens with UNIQUE keys. However, you can do this with REFCNT keys! Swap out the key, make a copy of it, swap it back in, and use the copy for the open (making sure to destroy the copy after the open).
One disadvantage of dynamic regions, which is inherited from unique
and reference-counted pointers, is that if you put a key in some
shared storage in a region r, then it is not the case that when
r is
deallocated that the key will be destroyed automatically. It’s up to
you to do the right thing or let the GC eventually collect it. In the
long run, the right thing to do is add a finalizer interface for
regions so that we can register a routine to deallocate a dynamic
region whenever we put it in a shared data structure. The same goes
for any unique pointer – we ought to have a way to register a
finalizer. This is on our To-do list.
Defaults and Shorthands
As described so far, the notation for alias qualifiers is extremely verbose. The default qualifier annotations and bounds are intended to capture the most common cases and reduce the burden on the programmer. Where the defaults do not suffice, shorthand versions allow explicit types to specified in a more compact manner.
The shorthand notation is as follows:
Alias Qualifier Specifier The strings \A, \U, \RC, \T
can be
used as substitutes for @aqual(ALIASABLE)
, @aqual(UNIQUE)
,
@aqual(REFCNT)
, and @aqual(RESTRICTED)
respectively. For example
void cons(int *\T a, int *\U b)
is equivalent to
void cmp(int *@aqual(RESTRICTED) a, int *@aqual(UNIQUE) b)
Type Variable Bound The bound on a type variable, as shown
previously, generally appears together with the effects in a function
type, or with the outlives relations for an aggregate type. However,
when the strings \A, \U, \RC, \T
succeed a type variable they are
interpreted as a bound on the type variable. This bound is only legal
within an struct declaration or a function type, i.e., bounds may not
appear within a local variable declaration, typedefs etc. If the type
variable is of alias qualifier kind (kind Q) then this is interpreted
as a qualifier bound. If the variable is of boxed, memory or abstract
kind (::B, ::M, or ::A) then it is interpreted as an aquals bound. For
instance, the function rqcopy above can be written as
list_t<`a,`r,`q> rqcopy(region_t<`r> r,aqual_t<`q\T> q,
list_t<`a\A,`r2,`p\T> l);
The same convention applies to aggregate types. For instance, we could define List as follows
struct List<`a::B\T,`r::R,`q::Q\T>{
`a hd;
struct List<`a,`r,`q> *@aqual(`q) `r tl;
};
Note that if a type variable is used to specify the alias qualifier on
a pointer, then the @aqual(.) notation must be used. That is,
int *@aqual(`q\T)
cannot be substituted with int *`q\T
.
Heap Pointers For unique and reference counted pointers that
reside in the heap we overload the region notation to provide a
further shorthand. This is also for backward compatibility with
previous versions of Cyclone where unique and reference counted
pointers always resided in a separate region. Thus the type
T *@aqual(\U) `H
can be written more compactly as T *`U
; the
type T *@aqual(\RC) `H
can be written as T *`RC
. The default
qualifier bounds are as follows:
Function Types The aliasability of all parameters and return types
in a function type is ALIASABLE by default. If a formal parameter is
declared as T *@aqual(\T)
then all subtypes of T
*@aqual(RESTRICTED) may be passed as an argument. For parametricity
T *@aqual(`q\T)
should be used instead.
Struct Types For pointers within a struct the default aliasability is ALIASABLE. For qualifier variables `q::Q the default bound is RESTRICTED. The aquals bounds for type variables is RESTRICTED by default. Thus the following declarations are equivalent
struct PtrList<`a::A, `q> {
`a *hd;
struct PtrList<`a,`q> *@aqual(`q) tl;
};
struct PtrList<`a::A\T,`q\T> {
`a *@aqual(\A) hd;
struct PtrList<`a,`q> *@aqual(`q) tl;
};
Type Instantiations When a type variable is instantiated with a pointer type we have to decide the aliasability of pointer. When instantiated in a function type or an aggregate type the aliasability is by default ALIASABLE. For instance, the following are equivalent:
void int_list(list_t<int*> l);
void int_list(list_t<int *@aqual(\A)> l);
struct Wrapper {
list_t<int*> l;
}
struct Wrapper {
list_t<int*@aqual(ALIASABLE)> l;
}
In variable declarations the default is RESTRICTED to allow for more aggressive unification. For instance
void int_list(list_t<int*@aqual(\T)> l) {
list_t<int*> cp = l;
}
In the local variable declaration the bound defaults to RESTRICTED so that unification with the formal succeeds.
Subtyping for Effect Qualifier and Reaps
We mentioned in Allocation and Freeing that there was more to the type of Core::rufree. The full type of the function is given below.
void rufree(region_t<`r> h, `a::A\T *\U `r ptr
: single(`r)) __attribute__((noliveunique(2)));
This type states that rufree
expects a handle into region `r
as its
first argument. The `r
effect qualifier on the second argument is
meant to indicate that ptr is a pointer into the region `r
, the handle
of which is the first argument. The \U
qualifier indicates that ptr
must be a unique pointer into that region. The type of the element
that is pointed to by ptr can be anything at all (`a::A
), and may
itself contain unique or reference-counted pointers, as indicated by
the qualifier bound \T
.
The attribute noliveunique
that appears in the type is a variant of
the consume
attribute shown earlier. This attribute states that
after the call to the rufree, the pointer passed as the second
parameter is consumed. Furthermore, if the element pointed to by ptr
contains a live unique pointer, then Cyclone issues a warning. The
reason is that since ptr is the only pointer to the underlying
element, if this element contains a unique or reference-counted
pointer, then the last reference to those pointers are lost leading to
a memory leak.
The last component of the type of rufree is the single(`r)
constraint. The effect qualifier annotation of `r
on both the
handle as well as the pointer is meant to force the pointer to point
into the region of the provided handle. However, in the presence of
subtyping over effect qualifiers (as described in
The Truth About Effects, Capabilities and Effect Subset Constraints) the annotations on the parameters alone
don’t suffice to enforce this invariant. This is illustrated by
the example below.
void bad_rufree(region_t<`r>, `a *\U `r);
void foo() {
region r;
int *`r ptr = rnew(r) 1;
region_t<`r+`H> a = Core::heap_region;
bad_rufree(a, ptr);
}
Subtyping of effect qualifiers means that it is permissble to treat
int *`r
as int*`r+`H
and so the call to bad_rufree
typechecks even though pointer ptr does not point into the region for
which a is the handle.
To avoid this behavior, the single(`r)
constraint is used to limit
subtyping on the effect qualifier variable `r.
In particular, an
effect qualifier variable that appears in a single(`r)
constraint
is guaranteed to always be instantiated with a single region name,
and not a set of region names.
Reap Allocator Implementation
To support deallocation of objects within a region the allocator must maintain some meta-data associated with each object. The current implementation uses a version of the bget allocator adapted for use with reaps. When deallocation is rare, bget behaves much like a simple pointer bumping allocator and we expect performance to be competitive. However, the metadata does consume two additional header words for each object allocated.
For normal regions (i.e. those declared as region r, or those created
using new_ukey
, or new_rckey
) the simple pointer-bumping allocator
is used. Thus, if your application does not use reaps at all, then the
overhead due to bget therefore does not apply.